We will clean the environment, set up the locations, define colors, and create a datestamp.
Clean the environment.
Set locations and working directories…
Create a new analysis directory...
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] "/Users/swvanderlaan/PLINK/analyses/consortia/CHARGE_1000G_CAC"
[1] "_archived" "1. CHARGE_1000G_CAC.nb.html"
[3] "1. CHARGE_1000G_CAC.Rmd" "2. bulkRNAseq.nb.html"
[5] "2. bulkRNAseq.Rmd" "20220128.CAC.RegionalAssociationPlots.RData"
[7] "20220129.CAC.Parsing_GWASSumStats.RData" "20220131.CAC.PolarMorphism.RData"
[9] "20220201.CAC.PolarMorphism.RData" "20220202.CAC.PolarMorphism.RData"
[11] "3. scRNAseq.nb.html" "3. scRNAseq.Rmd"
[13] "4. Parsing_GWASSumStats.nb.html" "4. Parsing_GWASSumStats.Rmd"
[15] "5. RegionalAssociationPlots.nb.html" "5. RegionalAssociationPlots.Rmd"
[17] "6. PolarMorphism.nb.html" "6. PolarMorphism.Rmd"
[19] "CAC" "CHARGE_1000G_CAC.Rproj"
[21] "CredibleSets" "images"
[23] "LICENSE" "PolarMorphism"
[25] "RACER" "README.html"
[27] "README.md" "README.orig.md"
[29] "renv" "renv.lock"
[31] "scripts" "SNP"
[33] "targets"
… a package-installation function …
source(paste0(PROJECT_loc, "/scripts/functions.R"))
… and load those packages.
install.packages.auto("readr")
install.packages.auto("optparse")
install.packages.auto("tools")
install.packages.auto("dplyr")
install.packages.auto("tidyr")
install.packages.auto("naniar")
# To get 'data.table' with 'fwrite' to be able to directly write gzipped-files
# Ref: https://stackoverflow.com/questions/42788401/is-possible-to-use-fwrite-from-data-table-with-gzfile
# install.packages("data.table", repos = "https://Rdatatable.gitlab.io/data.table")
library(data.table)
install.packages.auto("tidyverse")
install.packages.auto("knitr")
install.packages.auto("DT")
install.packages.auto("eeptools")
install.packages.auto("haven")
install.packages.auto("tableone")
install.packages.auto("BlandAltmanLeh")
# Install the devtools package from Hadley Wickham
install.packages.auto('devtools')
library(devtools)
# for plotting
install.packages.auto("pheatmap")
install.packages.auto("forestplot")
install.packages.auto("ggplot2")
install.packages.auto("ggpubr")
install.packages.auto("ggrepel")
install.packages.auto("UpSetR")
devtools::install_github("thomasp85/patchwork")
Using github PAT from envvar GITHUB_PAT
Skipping install of 'patchwork' from a github remote, the SHA1 (79223d30) has not changed since last install.
Use `force = TRUE` to force installation
# For regional association plots
install_github("oliviasabik/RACER")
Using github PAT from envvar GITHUB_PAT
Skipping install of 'RACER' from a github remote, the SHA1 (1394c9d4) has not changed since last install.
Use `force = TRUE` to force installation
# Load ggrepel (already installed above)
library(ggrepel)
# install ggsci
install.packages.auto("ggsci")
# plotly
install.packages.auto("plotly")
We will create a datestamp and define the Utrecht Science Park Colour Scheme.
Today = format(as.Date(as.POSIXlt(Sys.time())), "%Y%m%d")
Today.Report = format(as.Date(as.POSIXlt(Sys.time())), "%A, %B %d, %Y")
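The datestamp is later pasted into output file names, so a compact and sortable format is useful. A minimal sketch of the same idea, assuming only base R; `make_datestamp` is a hypothetical helper and not part of the notebook:

```r
# Hypothetical helper: format a date as a compact YYYYMMDD stamp.
# Sys.Date() already returns a Date, so no POSIXlt round trip is needed.
make_datestamp <- function(date = Sys.Date()) {
  format(as.Date(date), "%Y%m%d")
}

# A fixed date makes the behaviour easy to check.
stamp_fixed <- make_datestamp("2022-02-03")  # "20220203"

# Today's stamp, comparable to `Today` in the notebook.
stamp_today <- make_datestamp()
```

The long form used for `Today.Report` depends on the session locale, so its exact text may differ between machines.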
### Utrecht Science Park Colours Scheme
###
### Website to convert HEX to RGB: http://hex.colorrrs.com.
### For some functions you should divide these numbers by 255.
###
### No. Color HEX (RGB) CHR MAF/INFO
###---------------------------------------------------------------------------------------
### 1 yellow #FBB820 (251,184,32) => 1 or 1.0>INFO
### 2 gold #F59D10 (245,157,16) => 2
### 3 salmon #E55738 (229,87,56) => 3 or 0.05<MAF<0.2 or 0.4<INFO<0.6
### 4 darkpink #DB003F (219,0,63) => 4
### 5 lightpink #E35493 (227,84,147) => 5 or 0.8<INFO<1.0
### 6 pink #D5267B (213,38,123) => 6
### 7 hardpink #CC0071 (204,0,113) => 7
### 8 lightpurple #A8448A (168,68,138) => 8
### 9 purple #9A3480 (154,52,128) => 9
### 10 lavendel #8D5B9A (141,91,154) => 10
### 11 bluepurple #705296 (112,82,150) => 11
### 12 purpleblue #686AA9 (104,106,169) => 12
### 13 lightpurpleblue #6173AD (97,115,173/101,120,180) => 13
### 14 seablue #4C81BF (76,129,191) => 14
### 15 skyblue #2F8BC9 (47,139,201) => 15
### 16 azurblue #1290D9 (18,144,217) => 16 or 0.01<MAF<0.05 or 0.2<INFO<0.4
### 17 lightazurblue #1396D8 (19,150,216) => 17
### 18 greenblue #15A6C1 (21,166,193) => 18
### 19 seaweedgreen #5EB17F (94,177,127) => 19
### 20 yellowgreen #86B833 (134,184,51) => 20
### 21 lightmossgreen #C5D220 (197,210,32) => 21
### 22 mossgreen #9FC228 (159,194,40) => 22 or MAF>0.20 or 0.6<INFO<0.8
### 23 lightgreen #78B113 (120,177,19) => 23/X
### 24 green #49A01D (73,160,29) => 24/Y
### 25 grey #595A5C (89,90,92) => 25/XY or MAF<0.01 or 0.0<INFO<0.2
### 26 lightgrey #A2A3A4 (162,163,164) => 26/MT
###
### ADDITIONAL COLORS
### 27 midgrey #D7D8D7
### 28 verylightgrey #ECECEC
### 29 white #FFFFFF
### 30 black #000000
###----------------------------------------------------------------------------------------------
uithof_color = c("#FBB820","#F59D10","#E55738","#DB003F","#E35493","#D5267B",
"#CC0071","#A8448A","#9A3480","#8D5B9A","#705296","#686AA9",
"#6173AD","#4C81BF","#2F8BC9","#1290D9","#1396D8","#15A6C1",
"#5EB17F","#86B833","#C5D220","#9FC228","#78B113","#49A01D",
"#595A5C","#A2A3A4", "#D7D8D7", "#ECECEC", "#FFFFFF", "#000000")
uithof_color_legend = c("#FBB820", "#F59D10", "#E55738", "#DB003F", "#E35493",
"#D5267B", "#CC0071", "#A8448A", "#9A3480", "#8D5B9A",
"#705296", "#686AA9", "#6173AD", "#4C81BF", "#2F8BC9",
"#1290D9", "#1396D8", "#15A6C1", "#5EB17F", "#86B833",
"#C5D220", "#9FC228", "#78B113", "#49A01D", "#595A5C",
"#A2A3A4", "#D7D8D7", "#ECECEC", "#FFFFFF", "#000000")
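The comment table notes that some functions expect RGB values scaled to the 0-1 range. As a small sketch using only base R's `grDevices` (the variable names below are illustrative), `col2rgb()` recovers the 0-255 triplets from the HEX codes, and dividing by 255 rescales them:

```r
# Illustrative subset of the palette defined above.
example_hex <- c(yellow = "#FBB820", darkpink = "#DB003F", grey = "#595A5C")

# col2rgb() returns a 3 x n integer matrix with rows red, green, blue.
rgb_255 <- grDevices::col2rgb(example_hex)

# Rescale to 0-1 for functions such as grDevices::rgb().
rgb_unit <- rgb_255 / 255

# Round trip back to HEX to confirm nothing was lost.
hex_again <- grDevices::rgb(rgb_unit["red", ],
                            rgb_unit["green", ],
                            rgb_unit["blue", ])
```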
### ----------------------------------------------------------------------------
We will parse the data to create regional association plots for each of the 11 loci.
library("scales")
Attaching package: ‘scales’
The following object is masked from ‘package:purrr’:
discard
The following object is masked from ‘package:readr’:
col_factor
pal_npg("nrc")(10)
 [1] "#E64B35FF" "#4DBBD5FF" "#00A087FF" "#3C5488FF" "#F39B7FFF" "#8491B4FF" "#91D1C2FF" "#DC0000FF"
 [9] "#7E6148FF" "#B09C85FF"
show_col(pal_npg("nrc")(10))
# show_col(pal_npg("nrc", alpha = 0.6)(10))
We need to load the data first.
gwas_sumstats_racer <- readRDS(file = paste0(OUT_loc, "/gwas_sumstats_racer.rds"))
We are interested in 11 top loci.
library(openxlsx)
variant_list <- read.xlsx(paste0(TARGET_loc, "/Variants.xlsx"), sheet = "TopLoci")
DT::datatable(variant_list)
Let’s do some plotting.
library(RACER)
# Make directory for plots
ifelse(!dir.exists(file.path(PROJECT_loc, "/RACER")),
dir.create(file.path(PROJECT_loc, "/RACER")),
FALSE)
[1] FALSE
RACER_loc = paste0(PROJECT_loc,"/RACER")
variants_of_interest <- c(variant_list$rsID)
variants_of_interest_fewgenes <- c("rs9349379", "rs3844006", "rs2854746", "rs4977575", "rs9633535", "rs11063120", "rs9515203", "rs7182103")
for(VARIANT in variants_of_interest){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
# source(paste0(PROJECT_loc, "/scripts/functions.R"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 2, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
# ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
# ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}
Getting data for rs9349379.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9349379...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9349379&pop=EUR&r2_d=r2&token=c0f613f149ab"
[45%] Downloaded 32210 bytes...
[100%] Downloaded 70967 bytes...
Merging input association data with LD...
Plotting region surrounding rs9349379 on 6:12403957-13403957.
Plotting by...
snp rs9349379
Reading in association data
Generating Plot
Saving image for rs9349379.
Saving 7.29 x 4.51 in image
Getting data for rs3844006.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs3844006...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs3844006&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 31016 bytes...
Merging input association data with LD...
Plotting region surrounding rs3844006 on 6:131595002-132595002.
Plotting by...
snp rs3844006
Reading in association data
Generating Plot
Saving image for rs3844006.
Saving 7.29 x 4.51 in image
Getting data for rs2854746.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs2854746...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs2854746&pop=EUR&r2_d=r2&token=c0f613f149ab"
[44%] Downloaded 48575 bytes...
[52%] Downloaded 57344 bytes...
[67%] Downloaded 73710 bytes...
[100%] Downloaded 109298 bytes...
Merging input association data with LD...
Plotting region surrounding rs2854746 on 7:45460645-46460645.
Plotting by...
snp rs2854746
Reading in association data
Generating Plot
Saving image for rs2854746.
Saving 7.29 x 4.51 in image
Getting data for rs4977575.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs4977575...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs4977575&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 26000 bytes...
Merging input association data with LD...
Plotting region surrounding rs4977575 on 9:21624744-22624744.
Plotting by...
snp rs4977575
Reading in association data
Generating Plot
Saving image for rs4977575.
Saving 7.29 x 4.51 in image
Getting data for rs10899970.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10899970...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10899970&pop=EUR&r2_d=r2&token=c0f613f149ab"
[44%] Downloaded 71861 bytes...
[55%] Downloaded 88263 bytes...
[65%] Downloaded 104629 bytes...
[75%] Downloaded 121031 bytes...
[95%] Downloaded 153799 bytes...
[100%] Downloaded 160476 bytes...
Merging input association data with LD...
Plotting region surrounding rs10899970 on 10:44015716-45334720.
Plotting by...
snp rs10899970
Reading in association data
Generating Plot
Saving image for rs10899970.
Saving 7.29 x 4.51 in image
Getting data for rs9633535.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9633535...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9633535&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 49363 bytes...
Merging input association data with LD...
Plotting region surrounding rs9633535 on 10:63336088-64336088.
Plotting by...
snp rs9633535
Reading in association data
Generating Plot
Saving image for rs9633535.
Saving 7.29 x 4.51 in image
Getting data for rs10762577.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10762577...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10762577&pop=EUR&r2_d=r2&token=c0f613f149ab"
[77%] Downloaded 71862 bytes...
[100%] Downloaded 92997 bytes...
Merging input association data with LD...
Plotting region surrounding rs10762577 on 10:75417431-76417431.
Plotting by...
snp rs10762577
Reading in association data
Generating Plot
Saving image for rs10762577.
Saving 7.29 x 4.51 in image
Getting data for rs11063120.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs11063120...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs11063120&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 41883 bytes...
Merging input association data with LD...
Plotting region surrounding rs11063120 on 12:3986618-4986618.
Plotting by...
snp rs11063120
Reading in association data
Generating Plot
Saving image for rs11063120.
Saving 7.29 x 4.51 in image
Getting data for rs9515203.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9515203...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9515203&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 30139 bytes...
Merging input association data with LD...
Plotting region surrounding rs9515203 on 13:110549623-111549623.
Plotting by...
snp rs9515203
Reading in association data
Generating Plot
Saving image for rs9515203.
Saving 7.29 x 4.51 in image
Getting data for rs7182103.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7182103...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7182103&pop=EUR&r2_d=r2&token=c0f613f149ab"
[74%] Downloaded 87861 bytes...
[88%] Downloaded 104263 bytes...
[100%] Downloaded 118299 bytes...
Merging input association data with LD...
Plotting region surrounding rs7182103 on 15:78623946-79623946.
Plotting by...
snp rs7182103
Reading in association data
Generating Plot
Saving image for rs7182103.
Saving 7.29 x 4.51 in image
Getting data for rs7412.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7412...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7412&pop=EUR&r2_d=r2&token=c0f613f149ab"
[41%] Downloaded 15980 bytes...
[60%] Downloaded 23496 bytes...
[100%] Downloaded 38865 bytes...
Merging input association data with LD...
Plotting region surrounding rs7412 on 19:44912079-45912079.
Plotting by...
snp rs7412
Reading in association data
Generating Plot
Saving image for rs7412.
Saving 7.29 x 4.51 in image
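Each loop iteration above looks up a locus by rsID and keeps only the summary statistics inside its window. The sketch below reproduces that step on made-up data. The column names `Chr`, `Position`, and `rsID` follow the notebook, whereas `Start`, `End`, the helper `subset_region`, and every value are assumptions for illustration. Selecting by column name rather than by position (such as `[,5]` or `[,17]`) keeps working if the spreadsheet gains or loses a column.

```r
# Toy locus table; Start/End are assumed column names, values are invented.
toy_loci <- data.frame(
  rsID  = c("rsA", "rsB"),
  Chr   = c(6, 10),
  Start = c(100, 500),
  End   = c(200, 900),
  stringsAsFactors = FALSE
)

# Toy summary statistics in the same spirit as gwas_sumstats_racer.
toy_stats <- data.frame(
  rsID     = paste0("rs", 1:6),
  Chr      = c(6, 6, 6, 10, 10, 10),
  Position = c(50, 150, 200, 450, 700, 950),
  P        = c(0.2, 1e-8, 0.03, 0.5, 1e-6, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the rows of `stats` inside the window of `variant`.
subset_region <- function(stats, loci, variant) {
  locus <- loci[loci$rsID == variant, , drop = FALSE]
  stopifnot(nrow(locus) == 1)  # fail early on unknown or duplicated rsIDs
  stats[stats$Chr == locus$Chr &
          stats$Position >= locus$Start &
          stats$Position <= locus$End, , drop = FALSE]
}

region_a <- subset_region(toy_stats, toy_loci, "rsA")
region_b <- subset_region(toy_stats, toy_loci, "rsB")
```

The window bounds are inclusive on both sides, matching the `>=` and `<=` comparisons used in the loop.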
variants_of_interest_manygenes <- c("rs7412", "rs10762577")
source(paste0(PROJECT_loc, "/scripts/functions.R"))
for(VARIANT in variants_of_interest_manygenes){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 0.75, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}
Getting data for rs7412.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7412...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7412&pop=EUR&r2_d=r2&token=c0f613f149ab"
[41%] Downloaded 15980 bytes...
[60%] Downloaded 23496 bytes...
[100%] Downloaded 38865 bytes...
Merging input association data with LD...
Plotting region surrounding rs7412 on 19:44912079-45912079.
Plotting by...
snp rs7412
Reading in association data
Generating Plot
Saving image for rs7412.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs10762577.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10762577...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10762577&pop=EUR&r2_d=r2&token=c0f613f149ab"
[76%] Downloaded 71496 bytes...
[94%] Downloaded 87862 bytes...
[100%] Downloaded 92997 bytes...
Merging input association data with LD...
Plotting region surrounding rs10762577 on 10:75417431-76417431.
Plotting by...
snp rs10762577
Reading in association data
Generating Plot
Saving image for rs10762577.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
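Each plot is written in several formats with near-identical `ggsave()` calls. As a sketch of how the file names could be generated in one place (the helper `plot_paths` and the example values are hypothetical; only the naming pattern comes from the notebook):

```r
# Hypothetical helper: one path per extension, following the pattern
# <dir>/<number>.<datestamp>.<rsID>.regional_assoc.<ext>
plot_paths <- function(dir, nr, stamp, variant,
                       exts = c("png", "pdf", "eps")) {
  file.path(dir, paste0(nr, ".", stamp, ".", variant, ".regional_assoc.", exts))
}

# Invented example values for illustration.
example_paths <- plot_paths("RACER", 11, "20220203", "rs7412")

# Each element could then be passed on, for example:
# for (f in example_paths) ggsave(filename = f, plot = p1)
```

Passing the plot object explicitly (`plot = p1`) avoids depending on whatever `last_plot()` happens to return at that moment.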
variants_of_interest_cxcl12 <- c("rs10899970")
source(paste0(PROJECT_loc, "/scripts/functions.R"))
for(VARIANT in variants_of_interest_cxcl12){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19", set = "all",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 0.75, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}
Getting data for rs10899970.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10899970...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10899970&pop=EUR&r2_d=r2&token=c0f613f149ab"
[40%] Downloaded 65518 bytes...
[81%] Downloaded 131054 bytes...
[91%] Downloaded 147456 bytes...
[100%] Downloaded 160476 bytes...
Merging input association data with LD...
Plotting region surrounding rs10899970 on 10:44015716-45334720.
Plotting by...
snp rs10899970
Reading in association data
Generating Plot
Saving image for rs10899970.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
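The plotting chunks create their output directory with an `ifelse()` wrapper around `dir.create()`. An alternative sketch, assuming only base R and using a temporary directory in place of `PROJECT_loc`, relies on the arguments of `dir.create()` to make the call safe to repeat:

```r
# A temporary location stands in for PROJECT_loc in this sketch.
base_dir  <- tempfile("project_")
plots_dir <- file.path(base_dir, "RACER")

# recursive = TRUE also creates missing parents; showWarnings = FALSE
# silences the warning issued when the directory already exists.
dir.create(plots_dir, recursive = TRUE, showWarnings = FALSE)
first_call_ok <- dir.exists(plots_dir)

# Calling it again is harmless, so no dir.exists() guard is required.
dir.create(plots_dir, recursive = TRUE, showWarnings = FALSE)
second_call_ok <- dir.exists(plots_dir)
```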
We want to create some regional association plots to combine with the UCSC browser tracks, so we need exactly the same regions.
library(openxlsx)
add_list <- read.xlsx(paste0(TARGET_loc, "/Variants.xlsx"), sheet = "AdditionalPlots")
DT::datatable(add_list)
We want to color the credible sets, which we load here.
credset <- as_tibble(fread(paste0(PROJECT_loc, "/CredibleSets/CAC_EUR_AFR_cred_set_all_loci_50kb.txt")))
credset
We want to add the posterior probabilities and make a variable to color by.
gwas_sumstats_racer_credset <- merge(gwas_sumstats_racer,
credset %>% select(RSID, Posterior_Prob),
sort = FALSE,
by.x = "rsID", by.y = "RSID", all.x = TRUE) %>%
# mutate(., Posterior_Prob = ifelse(is.na(Posterior_Prob), 0, Posterior_Prob)) %>%
mutate(CredSet = case_when(Posterior_Prob > 0 ~ '95% credible set',
TRUE ~ 'not in credible set'))
head(gwas_sumstats_racer_credset)
table(gwas_sumstats_racer_credset$CredSet)
   95% credible set not in credible set
                103             8585944
summary(gwas_sumstats_racer_credset$Posterior_Prob)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
      0       0       0       0       0       1 8585944
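The merge above is a left join, so variants absent from the credible-set file get `NA` for `Posterior_Prob`, and `case_when()` sends those rows to the fallback label. A base-R sketch of the same logic on invented data (all values and the `toy_` names are assumptions); note that `ifelse()` propagates `NA`, so the missing case has to be handled explicitly:

```r
# Invented stand-ins for the GWAS summary statistics and the credible set.
toy_gwas <- data.frame(rsID = c("rs1", "rs2", "rs3", "rs4"),
                       P = c(1e-9, 0.04, 1e-7, 0.6),
                       stringsAsFactors = FALSE)
toy_credset <- data.frame(RSID = c("rs1", "rs3"),
                          Posterior_Prob = c(0.85, 0.10),
                          stringsAsFactors = FALSE)

# Left join: keep every GWAS row, add Posterior_Prob where available.
toy_merged <- merge(toy_gwas, toy_credset,
                    by.x = "rsID", by.y = "RSID",
                    all.x = TRUE, sort = FALSE)

# NA > 0 evaluates to NA, so test for missingness first.
in_set <- !is.na(toy_merged$Posterior_Prob) & toy_merged$Posterior_Prob > 0
toy_merged$CredSet <- ifelse(in_set, "95% credible set", "not in credible set")

credset_counts <- table(toy_merged$CredSet)
```

In the real output above, the number of `NA` values equals the number of variants labelled as not in the credible set, which is what this logic would produce.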
library(RACER)
# library(plotly)
# Make directory for plots
ifelse(!dir.exists(file.path(PROJECT_loc, "/RACER")),
dir.create(file.path(PROJECT_loc, "/RACER")),
FALSE)
[1] FALSE
RACER_loc = paste0(PROJECT_loc,"/RACER")
variants_of_interest <- c(add_list$rsID)
for(VARIANT in variants_of_interest){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(add_list, rsID == VARIANT)[,4]
tempSTART <- subset(add_list, rsID == VARIANT)[,5]
tempEND <- subset(add_list, rsID == VARIANT)[,6]
tempNAME <- subset(add_list, rsID == VARIANT)[,3]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer_credset, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
# temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f,
chr = tempCHR, build = "hg19",
plotby = "coord", snp_plot = VARIANT,
start_plot = tempSTART, end_plot = tempEND,
label_lead = FALSE,
grey_colors = FALSE,
cred_set = TRUE,
gene_track_h = 3, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.png"), plot = p1)
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.pdf"), plot = p1)
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.eps"), plot = p1)
# print(ggplotly(p1))
rm(temp, p1,
temp_f,
tempCHR, tempSTART, tempEND,
VARIANT, tempNAME)
}
Getting data for rs9633535.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs9633535 on 10:63584853-63921073.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs9633535.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs2854746.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs2854746 on 7:45894617-46054070.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs2854746.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs3844006.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs3844006 on 6:131937915-132289374.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs3844006.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Version: v1.3.0
Last update: 2022-02-03
Written by: Sander W. van der Laan (s.w.vanderlaan-2[at]umcutrecht.nl).
Description: Script to create regional association plots.
Minimum requirements: R version 3.4.3 (2017-06-30) -- 'Single Candle', Mac OS X El Capitan
Changes log
* v1.3.0 Added the credible sets to the additional regions.
* v1.2.0 Added additional regions.
* v1.1.0 Created PNG and PDF of top loci regions.
* v1.0.0 Initial version.
sessionInfo()
save.image(paste0(PROJECT_loc, "/",Today,".",PROJECTNAME,".RegionalAssociationPlots.RData"))
© 1979-2022 Sander W. van der Laan | s.w.vanderlaan[at]gmail.com | swvanderlaan.github.io